Evaluation of Many-to-Many Alignment Algorithm by Automatic Pronunciation Annotation Using Web Text Mining

نویسندگان

  • Keigo Kubo
  • Hiromichi Kawanami
  • Hiroshi Saruwatari
  • Kiyohiro Shikano
چکیده

The need for robust pronunciation annotation over out-of-vocabulary (OOV) words has been increasing with the development of an application that deals with proper nouns and brand-new words, such as Voice Search. In robust pronunciation annotation over OOV words, the alignment between graphemes and phonemes is vital data. For a many-to-many alignment algorithm between graphemes and phonemes, we describe its problems and methods to overcome them. An evaluation experiment of a many-to-many alignment by automatic pronunciation annotation using Web text mining is also performed. That experimental result shows that the proposed manyto-many alignment produces an alignment that has the high generalization ability for OOV words while avoiding degradation of the accuracy of the pronunciation annotation compared with the conventional approach.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Automatic Discovery of Technology Networks for Industrial-Scale R&D IT Projects via Data Mining

Industrial-Scale R&D IT Projects depend on many sub-technologies which need to be understood and have their risks analysed before the project can begin for their success. When planning such an industrial-scale project, the list of technologies and the associations of these technologies with each other is often complex and form a network. Discovery of this network of technologies is time consumi...

متن کامل

Techniques for accurate automatic annotation of speech waveforms

We describe techniques used in the development of an automatic annotation system for use with a concatenative text-to-speech synthesis system. The goal of the system is to generate automatically from word-level transcriptions annotations that result in synthetic speech of the same quality as that produced from hand-labelled speech. Our approach in this work has been to use the standard techniqu...

متن کامل

A survey on Automatic Text Summarization

Text summarization endeavors to produce a summary version of a text, while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to decipher through such massive amount of data, in order to extract the useful information, is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...

متن کامل

Automatic Service Composition Based on Graph Coloring

Web services as independent software components are published on the Internet by service providers and services are then called by users’ request. However, in many cases, no service alone can be found in the service repository that could satisfy the applicant satisfaction. Service composition provides new components by using an interactive model to accelerate the programs. Prior to service comp...

متن کامل

Automatic Service Composition Based on Graph Coloring

Web services as independent software components are published on the Internet by service providers and services are then called by users’ request. However, in many cases, no service alone can be found in the service repository that could satisfy the applicant satisfaction. Service composition provides new components by using an interactive model to accelerate the programs. Prior to service comp...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012